Preface


Legibility is but one aspect of pattern recognition, but it is
of fundamental importance. With good legibility, pattern decisions
are accurate and efficient. This thesis investigates the use of the
two-dimensional Fourier spatial frequency components to design 36
legible, human-recognizable alphanumeric symbols.

I am indebted to Dr. Matthew Kabrisky, Professor of Electrical
Engineering, Air Force Institute of Technology, for his aid and the
wide latitude he allowed in the conduct of this investigation. I am
deeply grateful to the sponsor of this project, Dr. Larry G. Goble,
Flight Dynamics Laboratory, who gave time, understanding and technical
assistance.

I wish to thank Dr. Roger Gagnon, Staff Development Engineer,
6570th Aerospace Medical Research Laboratory, for making the facilities
of the laboratory available to me and assisting me in their use.

Harvey  D.  Dahljelm 
Abstract

Legibility is of fundamental importance in pattern recognition.
The legibility of five alphanumeric sets was predicted by using the
maximum-minimum Euclidean distance of separation in a transform
feature space established by the truncated, two-dimensional, discrete
Fourier spatial frequency components of the symbols. The ASCII, NAMEL,
Huddleston and Lincoln/Mitre sets and a combination set were
psychophysically tested and ranked according to the least number of
human errors.

Test results confirmed the rank order of the legibility prediction:
combination, Lincoln/Mitre, Huddleston, NAMEL and ASCII. For a
symbol pair, the number of errors decreased as the distance of
separation increased, and the majority of the errors occurred with
the five symbols nearest to the confused symbol. An alphanumeric set
was designed with a predicted legibility greater than the test sets.

A numeric set was designed with a predicted legibility greater than
the Lincoln/Mitre, Mound, Mackworth, Lansdell, NAMEL, Huddleston or
ASCII digits.
INVESTIGATION OF ALPHANUMERIC SYMBOL
LEGIBILITY DETERMINATION BY USE OF
FOURIER SPATIAL FREQUENCY COMPONENTS


I.  Introduction 


To speed visual communication, each character or symbol must be
readily distinguished from all others; otherwise misidentification
and confusion might occur and interfere with communication. The visual
symbol identification problem is not only a human problem, but also a
machine problem.

One concept that might lead to the development of a legible,
human-recognizable alphanumeric symbol set for man and machine
communication involves the use of Fourier spatial frequency components
(FSFC) and the distance of separation between symbols in the FSFC
space. Fourier analysis can be used to describe a one-dimensional
function in terms of the frequency components of the function. In
a similar manner, the frequency components of a two-dimensional
function can be found by Fourier analysis and used to describe that
function. A symbol can be considered to be a two-dimensional visual
image. The spatial frequency components of a two-dimensional visual
image can be found by Fourier analysis (Ref 1:20) and used to classify
visual images (Ref 1:1). In order to describe a function exactly, an
infinite number of frequency components may be required; however, the
function can be approximated by using the fundamental low-frequency
components and deleting the high-frequency components that provide
only the fine detail of the function.
The purpose of this work was to examine the feature space established
by the FSFC basis vectors and design a human-recognizable, 36-symbol
alphanumeric set that was legible. In 1975, Vanderkolk, Herman and
Hershberger conducted an extensive literature survey of symbol legibility
and found legibility described in terms of behavioral data: color,
contrast, shape, active area, viewing angle, orientation and vibration
(Ref 2:80). In light of the ability to classify visual images by FSFC
(Ref 1:1), slight manipulations of the FSFC of a symbol should not
drastically change the form of a symbol, but might make a symbol more
legible and further separated from the rest of the symbols in the human
perception space. An extensive historical review and a summary of the
mathematical concepts involved in symbol legibility have been published
by these authors and only a brief summary will be presented here. The
object of this thesis was to design an alphanumeric symbol set using
FSFC and maximize the intersymbol minimum distance of separation; to
examine the distance of separation of four previously known legible
alphanumeric sets; and to test the legibility of a constructed set
against the predicted legibility order of the four known sets.

The problem is analyzed in terms of pattern recognition, discrete
Fourier transformations, FSFC feature space operations, legible symbols,
and assumptions required for analysis. The symbol data are generated
and a resultant symbol set designed and tested. Finally, results and
conclusions are given and followed by recommendations to clarify the
legibility problem.
II. Problem Analysis

In this chapter the background material is reviewed in sections
on: pattern recognition, discrete Fourier transformations, feature
space operations, symbol legibility and problem analysis assumptions.

Pattern Recognition

In 1969, Tallman (Ref 1:36) demonstrated that the inner
seven-by-seven (third harmonic) FSFC terms from the two-dimensional
Fourier transformation of English symbols could be used to identify
such symbols (95.8 percent correct). In 1973, Ullmann (Ref 3:292-299)
described pattern recognition by Fourier optics using Fourier
transformations by Fraunhofer diffraction of the alphanumeric
symbols A through Z and the numeric digits 0 through 9.

Optical Fourier transformation by Fraunhofer diffraction can
be shown in photographs that record the intensity of light in the
Fourier plane (Ref 3:296). The intensity at a point in the photographic
plane is equal to the square root of the sum of the squares of the
real and imaginary parts of the Fourier component at the point in
the Fourier plane. The Fraunhofer diffraction intensity patterns of
the alphabetic symbols A and B are shown in black in Figure 1,
Fraunhofer Diffraction Intensity Patterns, on the next page.
By observation of intensity regions in the Fourier plane, pattern
recognition of alphanumeric symbols can be made.
Discrete Fourier Transformation

The two-dimensional, discrete Fourier transformation of an
M-by-N dimension symbol array, $A_{m,n}$, may be written

$$a_{p,q} = \sum_{n=1}^{N} \sum_{m=1}^{M} A_{m,n} \exp\{-j2\pi[(mp/M) + (nq/N)]\} \qquad (1)$$

where p is a modulo-M index and q is a modulo-N index in the Fourier
plane. The digital computation is rapidly calculated by the Fast
Fourier Transform developed by Cooley and Tukey (Ref 4). A change of
one point in the $A_{m,n}$ array changes all $a_{p,q}$ values.

The inverse Fourier transform exists such that

$$A_{m,n} = \frac{1}{MN} \sum_{q=0}^{N-1} \sum_{p=0}^{M-1} a_{p,q} \exp\{+j2\pi[(mp/M) + (nq/N)]\} \qquad (2)$$
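Eq 1 and Eq 2 can be illustrated with a minimal Python sketch. It is not the PREVIP program or the Fast Fourier Transform: it evaluates the double sums directly, and it assumes zero-based indices (m = 0, ..., M-1), which is equivalent to the one-based sums only when the array index is taken modulo M and N. The function names dft2, idft2 and inner_terms are hypothetical, chosen for this illustration.

```python
import cmath

def dft2(A):
    """Direct two-dimensional DFT (Eq 1) of an M-by-N array given as a list of rows."""
    M, N = len(A), len(A[0])
    a = [[0j] * N for _ in range(M)]
    for p in range(M):
        for q in range(N):
            total = 0j
            for m in range(M):
                for n in range(N):
                    angle = -2.0 * cmath.pi * (m * p / M + n * q / N)
                    total += A[m][n] * cmath.exp(1j * angle)
            a[p][q] = total
    return a

def idft2(a):
    """Inverse transform (Eq 2), including the 1/(MN) scale factor."""
    M, N = len(a), len(a[0])
    A = [[0j] * N for _ in range(M)]
    for m in range(M):
        for n in range(N):
            total = 0j
            for p in range(M):
                for q in range(N):
                    angle = 2.0 * cmath.pi * (m * p / M + n * q / N)
                    total += a[p][q] * cmath.exp(1j * angle)
            A[m][n] = total / (M * N)
    return A

def inner_terms(a, harmonic=3):
    """Low-pass selection: the (2*harmonic+1)-squared terms around DC, indices taken modulo M and N."""
    M, N = len(a), len(a[0])
    span = range(-harmonic, harmonic + 1)
    return [a[p % M][q % N] for p in span for q in span]

# Toy 8-by-8 binary "L" shape (an assumption for illustration, not thesis data).
A = [[0] * 8 for _ in range(8)]
for m in range(1, 7):
    A[m][2] = 1          # vertical stroke
for n in range(2, 6):
    A[6][n] = 1          # horizontal stroke
a = dft2(A)
features = inner_terms(a)    # 49 low-frequency terms; the DC term sits at index 24
```

The direct sums cost on the order of (MN)^2 operations, which is acceptable for a toy array but is the reason the thesis relies on the Fast Fourier Transform for 64-by-64 arrays.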

Feature  Space  Operations 

A symbol feature space is an N-dimensional space that contains
the N component feature vectors that describe a symbol. A feature
space can be constructed from a set of orthogonal basis vectors.
Each FSFC can be a feature vector and each FSFC value a measure
along that vector. All FSFC of a symbol indicate the location of
the symbol in the feature space. Each symbol, approximated by its
low-frequency FSFC terms, can be represented by an N-dimensional vector
and it will lie at some location in the space. Several similar
symbols will cluster in the region of the symbol they are similar to.

In such a space, a variety of operations may be performed:
statistical measures can be made on the N vector components, linearly
independent vector sets may be used to construct an orthogonal vector
set, and distances can be determined between vectors.
Statistical Measures. A statistical measure can be the mean
or the variance of some values of a vector component. The mean of
N samples is the sum of the N sample values divided by N. The
variance is the sum of the N sample values squared minus the squared
sum of the sample values divided by the number of samples, N. The
mean-to-variance ratio is the mean divided by the variance.
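The three measures can be sketched in Python as follows. The verbal definition of the variance leaves the normalization open, so this sketch assumes the usual population form (the mean of the squared values minus the square of the mean); that choice, and the function names, are assumptions made for illustration.

```python
def mean(values):
    """Sum of the N sample values divided by N."""
    return sum(values) / len(values)

def variance(values):
    """Population variance (assumed form): mean of the squares minus the squared mean."""
    n = len(values)
    mu = sum(values) / n
    return sum(v * v for v in values) / n - mu * mu

def mean_to_variance(values):
    """Mean divided by variance; large magnitudes mark a stable, non-zero component."""
    return mean(values) / variance(values)

# Hypothetical values of one FSFC component across four samples of a symbol.
samples = [1.0, 2.0, 3.0, 4.0]
ratio = mean_to_variance(samples)   # mean 2.5, variance 1.25
```

A component with a large mean-to-variance magnitude changes little from one rendering of a symbol to the next, which is why the ratio is used later to rank components.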

Scalar Product. The scalar product of two vectors, x and y, is
denoted <x, y>. The scalar product operation is the summation of
the products of each component value in one vector and the
corresponding component value in the other vector. Two vectors are
said to be orthogonal if their scalar product is equal to zero.

Gram-Schmidt Orthogonalization Process. In an N-dimensional
space, for M less than N, a set of M linearly independent vectors, $x_i$,
can be processed by the Gram-Schmidt orthogonalization process to
construct an orthogonal set of M linearly independent vectors, $y_i$.
A linearly independent vector of N components is a vector that is
not a sum of constants times the other M-1 vectors. The Gram-Schmidt
orthogonalization process is

$$y_1 = x_1 \quad \text{and} \quad y_j = x_j - \sum_{i=1}^{j-1} \frac{\langle y_i, x_j \rangle}{\langle y_i, y_i \rangle}\, y_i, \qquad j = 2, 3, \ldots, M \qquad (3)$$
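The scalar product and the Gram-Schmidt recursion of Eq 3 can be sketched in a few lines of Python. The sketch assumes real-valued components and linearly independent inputs; the function names are hypothetical.

```python
def scalar_product(x, y):
    """Sum of the products of corresponding components."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(xs):
    """Return orthogonal vectors y_1..y_M built from independent vectors x_1..x_M (Eq 3)."""
    ys = []
    for x in xs:
        y = list(x)
        for prev in ys:
            # Remove the part of x that lies along each earlier orthogonal vector.
            coeff = scalar_product(prev, x) / scalar_product(prev, prev)
            y = [yi - coeff * pi for yi, pi in zip(y, prev)]
        ys.append(y)
    return ys

# Three independent vectors in a three-dimensional space (toy data).
ys = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
```

The result depends on the order in which the vectors are processed, since the first vector is kept unchanged and each later one is modified more.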


Distance. It is easy to define "distance" in a space. In an
orthogonal, N-dimensional space, the Euclidean distance of separation
may be written

$$d(\alpha, \beta) = \sqrt{\sum_{i=1}^{N} (\alpha_i - \beta_i)^2} = d(\beta, \alpha) \qquad (4)$$

where $\alpha_i$ and $\beta_i$ are the i-th component values of the vectors $\alpha$ and $\beta$.
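As a small illustration of the Euclidean distance of separation, the following Python sketch computes the distance between two feature vectors held as plain lists. The function name and the sample vectors are assumptions for illustration.

```python
import math

def distance(alpha, beta):
    """Euclidean distance of separation between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(alpha, beta)))

d_same = distance([0.3, 0.4], [0.3, 0.4])   # identical vectors give zero separation
d_diff = distance([0.0, 0.0], [3.0, 4.0])   # a 3-4-5 triangle
```

The distance is symmetric in its two arguments, so a full intersymbol table needs only the entries above its diagonal.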
Legibility

Symbol pair legibility is the capability of one symbol being
distinguished from another symbol. Using distance of separation,
identical symbols have zero distance of separation because $\alpha$
equals $\beta$ in Eq 4. A slight form variation from the first symbol
will result in a non-zero distance of separation. The smaller the
distance in the space, the more similar the symbols; the larger the
distance, the greater the legibility of one symbol with respect to
the other.

In the 1950's, legibility research on the optimal design of
alphanumeric symbols led to the development of the NAMEL symbols (Ref
2:81) and the Lansdell, Mound, and Mackworth digits (Ref 5:78). The
Lincoln/Mitre symbols were developed in 1966 (Ref 2:85). Alphanumeric
symbol sets were developed by Huddleston and the American Standard Code
for Information Interchange (ASCII). In 1975, Kabler digitized 150
different legible alphabets of 26 letters each (Ref 6), but did not
study them for legibility.

Assumptions 

The ASCII, Huddleston, Lincoln/Mitre and NAMEL alphanumeric symbol
sets were assumed to be legible, and the continuous, two-dimensional,
solid alphanumeric symbols were assumed to be adequately represented
by a discrete, binary, two-dimensional array with dimensions 42
units high and 30 units wide. To avoid the aliasing and leakage
problems of the Fourier transform computation, each symbol was
centered and embedded in a zeroed 64-by-64 unit array.

The low-frequency, inner seven-by-seven FSFC (Eq 1) were assumed to
adequately describe the alphanumeric symbol in a 49-dimensional
feature space, and a symbol can be constructed from the Fourier
components (Eq 2). Two different symbols are separated in FSFC space,
and Euclidean distance (Eq 4) was assumed to be an adequate metric
for legibility.

The size, height and width of the designed symbols were not
allowed to change from symbol to symbol or symbol set to symbol set.
Symbol lines also were not allowed to change in intensity, nor the
two-unit line width by more than one unit.
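The zero-embedding assumption above can be sketched as follows. This is a minimal Python illustration, assuming the symbol is a list of rows of 0/1 values and that any odd leftover margin goes to the bottom and right; the function name embed_centered is hypothetical.

```python
def embed_centered(symbol, size=64):
    """Center a binary symbol array inside a zeroed size-by-size array."""
    rows, cols = len(symbol), len(symbol[0])
    top = (size - rows) // 2      # blank rows above the symbol
    left = (size - cols) // 2     # blank columns to the left of the symbol
    field = [[0] * size for _ in range(size)]
    for r in range(rows):
        for c in range(cols):
            field[top + r][left + c] = symbol[r][c]
    return field

# A solid 42-by-30 block stands in for a real symbol (toy data).
block = [[1] * 30 for _ in range(42)]
field = embed_centered(block)     # occupies rows 11..52 and columns 17..46
```

The zero border keeps the periodic copies of the symbol implied by the discrete transform from touching one another.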
III. Data Generation and Testing

The data generation and testing chapter includes sections
on: flying-spot scanner data, Kabler data, constructed data,
and the testing procedure used.

Flying-spot  Scanner  Data 

The ASCII, Huddleston, Lincoln/Mitre and NAMEL alphanumeric
sets had each symbol drawn on a 14-unit-high by 10-unit-wide grid.
Each unit was a dot and the dot representation is shown in Figure 2
on the next page. Each symbol was digitized by the 6570th Aerospace
Medical Research Laboratory's Systems Research Laboratories flying-spot
scanner under the control of a Digital Equipment Corporation PDP-12
digital computer. The two-dimensional FSFC were calculated on the
Wright-Patterson AFB Computer Center Control Data 6600 computer.
The FSFC terms were computed by Dr. Roger A. Gagnon's PREVIP program
(Ref 7); the unit-energy normalized components were low-pass filtered
to the inner seven-by-seven terms; the intersymbol distance of
separation matrix was computed; and, for each symbol, the other
symbols were rank ordered with increasing distance of separation from
that symbol.
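The normalization, distance matrix and rank ordering steps described above can be sketched in Python. This is not the PREVIP program: it assumes each symbol has already been reduced to a list of feature values, and the function names and the three toy vectors are hypothetical.

```python
import math

def unit_energy(vec):
    """Scale a feature vector so that its sum of squared magnitudes is one."""
    energy = math.sqrt(sum(abs(v) ** 2 for v in vec))
    return [v / energy for v in vec]

def distance(a, b):
    """Euclidean distance of separation (Eq 4)."""
    return math.sqrt(sum(abs(x - y) ** 2 for x, y in zip(a, b)))

def distance_matrix(features):
    """Intersymbol distance table keyed by symbol name."""
    names = sorted(features)
    return {s: {t: distance(features[s], features[t]) for t in names} for s in names}

def nearest_ranking(matrix, symbol):
    """Other symbols ordered by increasing distance from the given symbol."""
    others = [t for t in matrix[symbol] if t != symbol]
    return sorted(others, key=lambda t: matrix[symbol][t])

def minimum_pair(matrix):
    """Closest pair in the set, returned as (distance, symbol, symbol)."""
    return min((matrix[s][t], s, t) for s in matrix for t in matrix[s] if s < t)

# Toy three-component feature vectors, not measured FSFC values.
raw = {"C": [1.0, 0.0, 0.0], "G": [0.9, 0.1, 0.0], "H": [0.0, 1.0, 0.0]}
features = {name: unit_energy(vec) for name, vec in raw.items()}
matrix = distance_matrix(features)
ranking = nearest_ranking(matrix, "C")
closest = minimum_pair(matrix)
```

The minimum pair of a set is the quantity used throughout the thesis as the predictor of that set's legibility.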

The digitized data was not identical to the input symbols, see
Figure 3. The ragged edges of the circular dots and stray noise dots
were removed by manual intervention and internal areas with holes
were made solid with the addition of missing units in the 64-by-64
array. Of course, the change of even one unit of the 64-by-64 unit
array of the symbol alters the energy in the symbol and changes all
the FSFC terms. The FSFC terms were calculated and the distance of
separation matrix was computed. The increases and decreases in the
distances of separation were recorded. Not all holes were filled, and
dots were enlarged or decreased, with developed symbols resembling
Braille or the American Banking Association check digits. The
direction of change of the FSFC terms was recorded.

Fig. 2. Dot Symbol Input to Flying-spot Scanner.

Six modification calculations were made. Statistics on the FSFC
terms for each symbol were recorded. The scalar product matrix for
all symbol pairs was calculated.

The 36-by-36 symbol scalar product matrix was used to construct
a Gram-Schmidt orthogonal set of new symbol vectors by Eq 3. Three
sequences were used to construct the orthogonal new symbol vectors
from the FSFC terms: 1) the sequential sequence A through Z and 0
through 9; 2) the frequency of usage in the English language: E, 0,
1, 2, 3, 4, 5, 6, 7, 8, 9, T, R, I, N, O, A, S, D, L, C, H, F, U,
P, M, Y, G, W, V, B, X, K, Q, J and Z; and 3) the smallest scalar
product pairs: Z, 6, H, I, 1, T, 0, 0, L, J, C, E, G, W, 2, 5, S,
B, D, A, M, N, Q, 8, U, 7, F, K, 3, V, X, Y, 4, R, P and 9. The new
FSFC symbols were printed by PREVIP; see Figure 4 for the symbol 6
that was produced from the third sequence. The amount and direction
of change was recorded.

Kabler  Data 

The small number of flying-spot samples was insufficient to
calculate statistical measures on the FSFC terms, but in 1973 Kabler
produced 3900 symbols digitized in a 32-by-32 array format (Ref 6).
The flying-spot data for the 150 different sets of 26 alphabetic
symbols was processed by a modified PREVIP program and the 49 FSFC
terms recorded. The modification was the increasing of the width of
narrow symbols; the height-to-width ratio of each symbol was reduced
to less than three-to-one. The symbol I was normalized to a six-to-one
ratio. The height of the symbols was constant, but the width of the
symbol varied. The height-to-width ratio was normalized because a unit
rectangular signal has a smaller, but wider spectrum than a two-unit
rectangular signal, which has a larger, narrower transform. The FSFC
population mean, variance and mean-to-variance ratio statistics were
computed for each of the 49 components of each symbol. The statistical
computations were recalculated after the 49 components were
normalized by the first (DC) harmonic term.
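The remark about rectangular signals can be checked with a short worked equation. It assumes an ideal continuous pulse of unit height and width $w$ centered at the origin, which is an idealization for illustration rather than a computation from the thesis data.

$$F_w(f) = \int_{-w/2}^{w/2} e^{-j2\pi f x}\, dx = \frac{\sin(\pi w f)}{\pi f}, \qquad F_w(0) = w, \qquad F_w(f) = 0 \ \text{first at} \ f = \frac{1}{w}$$

For $w = 1$ the peak value is 1 and the first zero falls at $f = 1$; for $w = 2$ the peak value is 2 and the first zero falls at $f = 1/2$. The two-unit pulse therefore has the larger but narrower transform, so symbols of very different widths would differ in their low-frequency terms for reasons unrelated to their shape.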

The  Fraunhofer  diffraction  patterns  of  Figure  1 can  be  determined 
from  the  squares  of  the  FSFC.  The  Kabler  FSFC  terms  were  squared  and 
the  statistics  calculated  on  the  squared  data  and  the  squared  DC 
normalized  data. 

Constructed  Data 

The flying-spot data was constructed using non-connective
circular dots. The constructed data was developed with connecting
square units to produce a solid figure; see Figure 5 on the next page
for sample letters of the Lincoln/Mitre symbols and Appendices A, B,
C, D and E for the other symbols. The flying-spot computations were
repeated.

An experimental alphanumeric symbol set was designed using the
Kabler data statistical component information, observations of the
inverse Fourier transformations of the modified flying-spot FSFC,
and unit block movements in the 14-by-10 unit symbol array.
A 144-symbol intersymbol distance of separation matrix was
calculated for the ASCII, Huddleston, Lincoln/Mitre and NAMEL symbol
sets. A combination set was developed by a manual iterative search
procedure to maximize the minimum intersymbol distance of separation
over the entire set. The iterative procedure was as follows: a set
was selected, the symbol pair with the minimum distance of separation
was determined, either symbol was substituted to develop a new set,
and the procedure was repeated. The experimental symbols were added
and the resultant 180-symbol, 180-by-180 distance matrix was searched
for a maximum-minimum distance combination set. See Figure 6 for the
procedure.

A 144-symbol intersymbol distance of separation matrix was
calculated for the experimental set; the Lincoln/Mitre set; and the
ASCII, Huddleston, Lansdell, Mackworth, Mound and NAMEL digits.
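The iterative search described above was performed manually; the following Python sketch is only one automated reading of it, under stated assumptions. It assumes each character has a feature vector in one or more source sets, substitutes a different source for either member of the closest pair, accepts the substitution that most increases the minimum distance, and stops when no substitution helps. All names and the toy vectors are hypothetical.

```python
import math
from itertools import combinations

def distance(a, b):
    """Euclidean distance of separation (Eq 4)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def min_pair(selection, candidates):
    """Closest pair of the selected set as (distance, character, character)."""
    return min(
        (distance(candidates[c1][selection[c1]], candidates[c2][selection[c2]]), c1, c2)
        for c1, c2 in combinations(sorted(selection), 2)
    )

def maximize_minimum(selection, candidates):
    """Greedy substitution on the closest pair until the minimum distance stops growing."""
    selection = dict(selection)
    while True:
        best_d, c1, c2 = min_pair(selection, candidates)
        best_trial = None
        for ch in (c1, c2):
            for source in candidates[ch]:
                if source == selection[ch]:
                    continue
                trial = dict(selection)
                trial[ch] = source            # swap in the same character from another set
                d = min_pair(trial, candidates)[0]
                if d > best_d:
                    best_d, best_trial = d, trial
        if best_trial is None:
            return selection, best_d
        selection = best_trial

# Toy two-component vectors; the source names only label where a variant came from.
candidates = {
    "C": {"ASCII": [0.0, 0.0], "NAMEL": [0.0, 2.0]},
    "G": {"ASCII": [0.1, 0.0], "NAMEL": [3.0, 0.0]},
    "O": {"ASCII": [0.0, 1.0]},
}
start = {"C": "ASCII", "G": "ASCII", "O": "ASCII"}
final, d_min = maximize_minimum(start, candidates)
```

Because only the closest pair is examined at each step, the search can stop at a set that is not the best possible one, which is also a risk of the manual procedure.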

Testing 

The testing of the generated alphanumeric sets was independently
conducted by Dr. Larry G. Goble of the Flight Dynamics Laboratory,
Wright-Patterson AFB, Ohio. In the psychophysical legibility testing,
the observer subject fixated on a dot; a mask of a dot pattern
was exhibited for 13 milliseconds; the mask was removed for 5
milliseconds; the alphanumeric symbol was exhibited for 15
milliseconds; the symbol was removed for 5 milliseconds; the dot mask
was displayed for 15 milliseconds; and the subject identified the
observed symbol. See Figure 7 for the time sequence of the
psychophysical human perception testing.

Over five days, 9,000 symbols were observed by five subjects
and a confusion matrix of the exhibited symbol versus the identified
symbol was constructed. On the third day, the symbol exhibition
time was reduced to 10 milliseconds to avoid saturation at 100
percent correct symbol selection.
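The confusion matrix can be tallied from the subject responses as in the following Python sketch. It assumes each trial is recorded as a pair (exhibited symbol, identified symbol); the function names and the four sample trials are hypothetical and are not test data from this investigation.

```python
from collections import Counter

def confusion_matrix(trials):
    """Count how often each exhibited symbol was identified as each response."""
    return Counter(trials)

def total_errors(matrix):
    """Number of trials in which the response differed from the exhibited symbol."""
    return sum(n for (shown, named), n in matrix.items() if shown != named)

def error_pairs(matrix, threshold):
    """Off-diagonal entries with at least `threshold` errors."""
    return {pair: n for pair, n in matrix.items()
            if pair[0] != pair[1] and n >= threshold}

trials = [("C", "G"), ("C", "C"), ("C", "G"), ("G", "C")]
matrix = confusion_matrix(trials)
errors = total_errors(matrix)
frequent = error_pairs(matrix, 2)
```

The matrix need not be symmetric: here C was reported as G twice, while G was reported as C once.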

The 36-symbol alphanumeric sets of ASCII, Huddleston,
Lincoln/Mitre and NAMEL, and a test set, were used in the
psychophysical testing. The test set consisted of the F, M, R, T, U,
2 symbols of the ASCII set; the C, E, K symbols of the NAMEL set; the
B, N, P, S, W, 8 symbols of the Huddleston set; and the A, D, G, H,
I, J, L, O, Q, V, X, Y, Z, 0, 1, 3, 4, 5, 6, 7 and 9 symbols of the
Lincoln/Mitre set.

The combination set of the four sets (ASCII symbols A, F, M,
R, T, U, Z, 2; Huddleston symbols B, N, P, S, W, 8; NAMEL symbols
C, E, K; and Lincoln/Mitre symbols A, D, G, H, I, J, L, O, Q, V, X,
Y, 0, 1, 3, 4, 5, 6, 7 and 9) was not tested. The combination set
of the five sets (ASCII symbols H, L, M; NAMEL symbols C, K, 6;
Huddleston symbols B, N, S, W; Lincoln/Mitre symbols A, D, J, O, Q,
V, Y, Z, 0, 3, 7, 9; and experimental symbols E, F, G, I, P, R, T, U,
X, 1, 2, 4, 6 and 8) was not tested.


IV.  Findings 


The findings chapter includes two sections: results and
conclusions. The investigation was initiated on 1 August 1976 and
concluded on 1 November 1976.

Results 

The results of this investigation are reported in four sections:
flying-spot data, Kabler data, constructed data, and psychophysical
testing.

Flying-spot Data. The flying-spot data 49 FSFC terms and the
distance of separation matrix were computed. See Figure 8 on the next
page for an example of a distance matrix computer printout and the
increasing ordering for each symbol. The minimum distance of separation
is presented below in Table I for noisy data and Table II for reduced
noise data. The ASCII C-G pair was the minimum pair for both runs and
the NAMEL H-I pair the maximum, even though the distances changed.

Table I

Noisy Flying-spot Data

                      Minimum               Maximum
Symbol Set         Pair   Distance       Pair   Distance

NAMEL              C-G     .2401         H-I    1.0920
ASCII              C-G     .2336         F-J    1.0640
Lincoln/Mitre      P-R     .3304         H-l     .9412
Huddleston         I-1     .3242         L-7     .9483

Table II

Reduced Noise Flying-spot Data

                      Minimum               Maximum
Symbol Set         Pair   Distance       Pair   Distance

NAMEL              C-G     .2404         H-I    1.1370
ASCII              C-G     .2352         F-J    1.0640
Lincoln/Mitre      K-X     .2956         H-l     .9905
Huddleston         I-1     .3206         U-l     .9656
Fig. 8. Example Distance Matrix and Nearest Symbol Ranking.
Fig. 9. New Symbol 6 from Gram-Schmidt Orthogonalization.

The Gram-Schmidt orthogonalization process was applied to the
FSFC data. The inverse transform in Figure 4 represents data symbol 6.
The new, orthogonal-to-Z symbol 6 is shown in Figure 9 on the preceding
page. The third symbol was not recognizable as an alphanumeric symbol.

The flying-spot data distance matrix was ordered for each
symbol and the closest-neighbor symbol families were observed for
each alphanumeric set, as shown in Table III below.


Table  III 

Huddleston  Symbol  Families 


Kabler Data. The size-modified Kabler data was normalized by the
first (DC) term to observe component ratios, and the rank order of the
largest mean-to-variance ratio for each symbol component was
determined; see Table IV on the next page. The statistics for the
symbol components are also included in Table IV. Squaring the raw data
increased the mean-to-variance ratio, but the rank order remained the
same.

Constructed Data. The minimum and maximum distance of separation
for the constructed data is presented in Table V. A distribution of
Table IV

Kabler Data Normalized by DC Term Statistics

                                           Mean-to-Variance
Symbol    Component     Mean     Variance       Ratio

A             9        -.4561     .0195         -23.4
B            10        -.3194     .0128         -24.9
C            21         .6422     .0487          13.2
D            17         .3482     .0223          15.6
E            16         .2011     .0149          23.1
F            29        -.1404     .0176         -25.6
G            23        -.5201     .0313         -16.6
H            31         .2340     .0109          21.5
I            24        -.2052     .0029         -70.3
J            23        -.3251     .0086         -37.9
K            10        -.4413     .0152         -29.1
L            24         .3771     .0149         -25.3
M             9        -.4088     .0243         -16.8
N            23        -.2296     .0080         -28.6
O            42         .2708     .0244          11.1
P            21         .2504     .0086          29.2
Q            37        -.1333     .0152          -8.7
R            10        -.3616     .0134         -26.9
S            10        -.4075     .0174          23.4
T            13         .2636     .0115          27.8
U            24         .2978     .0229         -13.0
V            23        -.1462     .0064          23.0
W            10        -.4928     .0333         -14.8
X            21         .4453     .0236          18.8
Y            10        -.4531     .0221         -20.5
Z             9        -.4126     .0324         -12.7

Mean-to-Variance Ratio: High = 70.3, Low = 0.1
Number of Samples = 113

Kabler Data Normalized by First (DC) Term

                                        Mean-to-Variance
Symbol    Largest Component Order         Ratio Range

E         16 15 11 23  9 21 10               23-13
P          9 29 23 21  4 16                  27-13
P          9 21 24 29 47  8                  33-15
B         10 23 24  9 42 21                  27-18
C         24 23 23 21 11 41                  17-12
Q         24 13 14 21 23 25                   9-6
O         42 41 23 24 47 22                  11-7
U         24 23 31 17 18 37                  13-8
Table V

Constructed Data Distances of Separation

                        Distance of Separation
Type Symbol Set         Minimum       Maximum

ASCII                    .2022        1.5605
NAMEL                    .2022        1.5512
Huddleston               .4116        1.5527
Lincoln/Mitre            .4328        1.5752
Combination of 4         .4694        1.5752
Experimental             .4032        1.5092
Combination of 5         .5616        1.5729

Table VI

Set Distance Histogram Data

                    Number of Distances in Range Bin
Symbol Set            .25        .50        .75

ASCII                   4         52        254
NAMEL                   2         30        174
Huddleston              0         16        210
Lincoln/Mitre           0         14        188
Combination             0          6        144
Experimental            0         10        156
Combination             0          0         98

Table VII

Digit Set Minimum Distances

                      Minimum        Distance of
Type of Digits        Digit Pair     Separation

Experimental             3-5           .6390
Lincoln/Mitre            2-8           .6255
Mound                    3-8           .6271
Mackworth                3-5           .6102
Lansdell                 3-8           .6042
NAMEL                    5-6           .5983
Combination              5-0           .5734
Huddleston               6-8           .4512
ASCII                    0-8           .3016
distances is presented in Table VI and the predicted legibility order
of the tested sets is as follows: the test set, Lincoln/Mitre set,
Huddleston set, ASCII set and NAMEL set. The predicted legibility of
the numeric digits was calculated and is presented in Table VII, Digit
Set Minimum Distances, on the preceding page.

Psychophysical Testing. The psychophysical testing results are
presented in Figure 10, Psychophysical Symbol Legibility Test Results,
on the following page. On the following pages, the number of incorrect
identification errors versus the distance of separation rank order of
closest neighbors is shown in Figure 11, Number of Errors versus Rank
Order-Distance Number. Figure 12, Errors versus Distance of Separation
for a Symbol Pair, shows that the number of errors decreases as the
distance of separation of a given symbol pair, C and G, increases.
Similar results occur for symbol pair O-Q.

The non-symmetric confusion matrix was constructed from subject
responses, and the order of least number of errors after the fifth day
was as predicted: the test set, the Lincoln/Mitre set, the Huddleston
set, the NAMEL set and the ASCII set. The entries in the confusion
matrix that are greater than ten are presented in Table VIII,
Confusion Matrix Error Pairs. An analysis of variance calculation was
performed, which produced a .01 significance level for the test
results.
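
As an illustration of how error pairs can be extracted from a
non-symmetric confusion matrix, the following Python sketch counts
(symbol shown, symbol reported) responses and lists the off-diagonal
entries at or above a threshold. The response data, the function names
and the threshold value are hypothetical and are not the data of this
investigation.

```python
from collections import Counter

def build_confusion(responses):
    """Count (shown, reported) pairs; C reported as G is kept apart from G reported as C."""
    return Counter(responses)

def error_pairs(confusion, threshold):
    """Return off-diagonal entries with at least `threshold` errors, largest first."""
    cells = [(shown, reported, count)
             for (shown, reported), count in confusion.items()
             if shown != reported and count >= threshold]
    return sorted(cells, key=lambda cell: -cell[2])

# Hypothetical responses: (symbol shown, symbol reported).
responses = ([("C", "G")] * 12 + [("G", "C")] * 3 +
             [("P", "R")] * 13 + [("C", "C")] * 40)

confusion = build_confusion(responses)
pairs = error_pairs(confusion, threshold=10)
print(pairs)
```

Because the matrix is not symmetric, the entry for C reported as G is
counted separately from the entry for G reported as C.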

Conclusions 

The primary objective of the investigation of alphanumeric
symbol legibility was to design an alphanumeric symbol set by using
Fourier spatial frequency components and test that set against the

Table VIII

Confusion Matrix Error Pairs

Symbol Set       Pair    Errors    Distance Rank Order    Set Total Errors

ASCII            C-G       12              1                    392
                 I-1       12              1
                 P-R       13              2
                 S-8       10              1
                 3-8       12              2

NAMEL            I-1       36              2                    404
                 O-0       19              2
                 V-X       16              4
                 C-G       15              1

Huddleston       G-6       18              1                    379
                 P-R       18              3

Lincoln/Mitre    2-9       19              4                    259
                 T-8       13             13
                 I-1       12              2
                 P-R       12              1

Test             S-8       14              6                    279
                 1-9       12              3
                 P-R       10              3

legibility of the ASCII, Huddleston, Lincoln/Mitre and NAMEL
alphanumeric sets. To meet this objective and predict legibility, it
was necessary to find the Fourier spatial frequency components (FSFC)
of several alphabetic and numeric sets and compute the Euclidean
distance of separation in FSFC space.
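
The two steps named above, finding the low-order Fourier components
of a digitized symbol and computing the Euclidean distance between
symbols, can be sketched in Python as follows. This is a minimal
sketch under stated assumptions: the 8-by-8 symbol patterns, the
normalization, the use of real and imaginary parts as features, and
all function names are hypothetical and may differ from the procedure
used in the investigation.

```python
import cmath
import math

def to_grid(rows):
    """Convert rows of '#' and '.' characters to a 0/1 grid."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]

def dft2_component(grid, u, v):
    """Normalized 2-D discrete Fourier component F(u, v) of the grid."""
    n_rows, n_cols = len(grid), len(grid[0])
    total = 0j
    for y, row in enumerate(grid):
        for x, value in enumerate(row):
            if value:
                phase = -2j * math.pi * (u * y / n_rows + v * x / n_cols)
                total += value * cmath.exp(phase)
    return total / (n_rows * n_cols)

def fsfc_features(grid, max_harmonic=3):
    """Real and imaginary parts of the components with |u|, |v| <= max_harmonic.

    The seven-by-seven truncation (max_harmonic=3) and the real/imaginary
    layout are assumptions made for this sketch.
    """
    features = []
    for u in range(-max_harmonic, max_harmonic + 1):
        for v in range(-max_harmonic, max_harmonic + 1):
            component = dft2_component(grid, u, v)
            features.extend([component.real, component.imag])
    return features

def separation(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Hypothetical 8-by-8 symbols, drawn only for illustration.
C = to_grid(["..####..", ".#......", "#.......", "#.......",
             "#.......", "#.......", ".#......", "..####.."])
G = to_grid(["..####..", ".#......", "#.......", "#...###.",
             "#.....#.", "#.....#.", ".#....#.", "..####.."])
I = to_grid(["..####..", "...##...", "...##...", "...##...",
             "...##...", "...##...", "...##...", "..####.."])

d_cg = separation(fsfc_features(C), fsfc_features(G))
d_ci = separation(fsfc_features(C), fsfc_features(I))
print(f"C-G: {d_cg:.4f}  C-I: {d_ci:.4f}")
```

For these illustrative patterns the C and G symbols, which differ in
only six elements, lie closer together in the feature space than the
C and I symbols.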

The assumptions stated in Chapter II were justified by the
inverse Fourier transformation of the FSFC, see Figure 4, and by the
small changes in some of the components of widely varying symbols,
see Table IV. The Gram-Schmidt process can generate new separated
symbols, but the symbols are not recognizable as alphanumeric, see
Figure 9.

It is concluded that the objective of this investigation was
attained by using the first, second and third harmonic terms as a
basis for a FSFC feature space, and by using the maximized Euclidean
distance of separation to increase symbol pair legibility. Euclidean
distance in FSFC space is a metric for legibility. The legibility
of symbol sets can be predicted by rank ordering them by maximum-
minimum distance of separation and the histogram of the distance
distribution, see Tables V and VI.
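
The maximum-minimum criterion can be illustrated with a short Python
sketch: the closest pair of symbols is found within each set, and the
sets are then ranked so that the set with the largest minimum distance
comes first. The two-component feature vectors and the set names below
are hypothetical and serve only to show the procedure.

```python
import math
from itertools import combinations

def min_separation(features):
    """Return (distance, pair) for the closest pair of symbols in one set."""
    return min((math.dist(features[a], features[b]), (a, b))
               for a, b in combinations(sorted(features), 2))

def rank_sets(sets):
    """Order set names by decreasing minimum distance of separation."""
    return sorted(sets, key=lambda name: min_separation(sets[name])[0],
                  reverse=True)

# Hypothetical two-component feature vectors for three symbols per set.
sets = {
    "set_a": {"C": (0.0, 0.0), "G": (0.3, 0.4), "O": (1.0, 0.0)},
    "set_b": {"C": (0.0, 0.0), "G": (0.6, 0.8), "O": (2.0, 0.0)},
}

distance_a, pair_a = min_separation(sets["set_a"])
ranking = rank_sets(sets)
print(pair_a, round(distance_a, 4), ranking)
```

In this example set_b is ranked ahead of set_a because its closest
pair is separated by the larger distance.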

Families of similar symbols can be constructed by rank ordering
the distance of separation, see Table III, and the FSFC terms of a
family have similar large mean-to-variance ratios for samples of
different alphabetic sets. From Table IV, the tenth component has a
mean-to-variance ratio greater than 25 for the family B, R, K, and Y.
The value of the tenth component increases across the family.
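
A mean-to-variance ratio of this kind can be computed as in the
following Python sketch. The sample values are hypothetical and are
not taken from Table IV, and the use of the sample variance (rather
than the population variance) is an assumption of this example.

```python
from statistics import mean, variance

def mean_to_variance(values):
    """Ratio of the sample mean to the sample variance of one component."""
    return mean(values) / variance(values)

# Hypothetical values of one FSFC component for the same letter
# taken from five different alphabetic sets.
tenth_component = [0.50, 0.52, 0.48, 0.51, 0.49]

ratio = mean_to_variance(tenth_component)
print(round(ratio, 1))
```

A large ratio indicates that the component changes little from set to
set relative to its size, which is the property used here to
characterize a family.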

For a given symbol pair, the number of recognition errors
decreases as the distance of separation of the symbols is increased,
see Figure 12. The recognition errors are the largest with the
five closest symbols to a confused symbol, as shown in Figure 11
and Table VIII.
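
The distance rank order of the neighbors of a symbol can be obtained
as in the following Python sketch, which lists the five closest
symbols to a chosen symbol. The feature vectors are hypothetical
two-component values chosen for illustration, and the function name is
an assumption of this example.

```python
import math

def neighbors_by_rank(target, features, k=5):
    """Return the k symbols closest to `target`, nearest first, with distances."""
    others = [(math.dist(features[target], vector), symbol)
              for symbol, vector in features.items() if symbol != target]
    return [(symbol, distance) for distance, symbol in sorted(others)[:k]]

# Hypothetical two-component feature vectors.
features = {
    "C": (0.0, 0.0), "G": (0.1, 0.0), "6": (0.2, 0.1), "O": (0.0, 0.3),
    "Q": (0.3, 0.4), "0": (0.6, 0.0), "I": (2.0, 2.0),
}

nearest = neighbors_by_rank("C", features)
names = [symbol for symbol, _ in nearest]
print(names)
```

The position of a symbol in this list corresponds to the distance rank
order reported for each error pair in Table VIII.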
V. Recommendations


Although the results of this investigation are statistically
significant, only five subjects were used in the 9,000 tests of the
testing phase. The rising curves in Figure 10, Psychophysical
Symbol Legibility Test Results, might level off and converge as
the observers learn the symbols and as more symbols are observed.

It is recommended that the combination set designed with a
maximum-minimum distance of separation of .5616 be tested by ten
observers using 18,000 observations. It is also suggested that the
first 9,000 tests be made using a five millisecond observation
time and the final 9,000 tests using 15 milliseconds as the symbol
exhibition time.

It is suggested that the height and width of the symbol be
allowed to change. Since even a one point change in the digital
representation of the symbol changes the FSFC terms, the line width
should be allowed to change, and a variation in the digitization
resolution should be investigated to find the boundaries of the
region in which a symbol lies in the FSFC space. The symbol line
intensity should not be allowed to vary.

This investigation predicted the legibility of the experimental,
Lincoln/Mitre, Mound, Mackworth, Lansdell, NAMEL, Huddleston and
ASCII numeric sets. It is recommended that the legibility of these
sets be psychophysically tested. It is suggested that the Braille
and American Bankers Association magnetic checking digits be included
in the study.
The investigation used a square, seven-by-seven filter. It is
suggested that the first harmonic term not be used because it is a
measure of the number of units used to construct the symbol. Use of
the second, third and fourth harmonic terms is recommended, and it is
recommended that a rectangular filter be constructed to contain
horizontal terms in a greater proportion to the vertical terms
because of the width changes in a symbol from set to set.